Online learning (i.e., e-learning) systems have grown significantly in recent years, driven in part by
pandemics such as COVID-19. Forecasting student performance has become a major task as institutions
focus on improving the quality of education and students' performance. Data mining (DM) with machine
learning (ML) techniques has been applied to e-learning platforms to analyze student session streams
and predict academic performance, with good results. However, recent studies show that ML-based
methods perform poorly when data is imbalanced. To address this, ensemble learning combines multiple
ML algorithms and chooses the best model according to the data. However, existing ensemble-based
models do not incorporate feature importance into the student performance prediction model and
therefore perform poorly, especially for multi-label classification. To address this, this paper
presents an improved ensemble learning mechanism obtained by modifying the XGBoost algorithm, namely
modified XGBoost (MXGB). MXGB incorporates an effective cross-validation scheme that learns
correlations among features more efficiently. The experimental results show that the proposed
MXGB-based student performance prediction model achieves much better prediction accuracy than the
state-of-the-art ensemble-based student performance prediction model.
1. INTRODUCTION


The wide use of the internet and the growth of information technology have changed the way academia
and industry learn: learning has moved from the conventional offline mode to an online mode, namely
the e-learning platform [1]. During the COVID-19 pandemic in particular, all classes moved to an
online mode, highlighting the significance of the e-learning platform. However, significant challenges
remain in providing a reliable and accurate model to predict student performance [2]. Designing an
effective assessment model that captures student behavior from the session streams of the e-learning
platform will help improve students' academic performance by providing personalized content.

Personalized content delivery that improves student performance according to individual behavior
in the e-learning platform is a major challenge of the current century [3]. Adaptive personalization
techniques for understanding learner profiles have been emphasized [4], [5]. Recently, data mining (DM)
and machine learning (ML) have been used for building student performance prediction models. DM has
been used to extract useful insight from the student session stream data of the e-learning platform, as
shown in Figure 1; it also improves decision-making by establishing behavior patterns from data [6]-[9].
Both ML and DM methodologies are very promising in different fields such as business, network security,
and education. Recently, a new field, namely educational data mining (EDM), has emerged for enhancing
learning style, understanding behavior, and improving student performance [10]-[13]. EDM data is
composed of different information such as administration data, student session stream activity, and
student academic performance data. Prior work has provided an EDM dataset collected from different
databases and e-learning systems, and has constructed different ML models and an ensemble learning
mechanism for predicting student performance during the course. The outcome shows that the ensemble
model outperforms the other models in terms of prediction accuracy [14]-[16]. However, when data is
imbalanced, these models fail to establish which features affect the predictive model and thus provide
poor classification accuracy.

The objective of this paper is to build an effective model for predicting student grades during the
course through an ensemble-based ML model that works well for student session stream e-learning data
[17]-[19]. Existing models construct ensemble learning by combining multiple ML models. However, these
models are effective for binary classification problems; when applied to multi-label classification
problems with imbalanced data, they exhibit poor accuracy [20], [21]. These limitations motivate this
research work to develop an improved student performance prediction model through an improved ensemble
methodology [22], [23].

This paper presents an effective student performance prediction model based on an improved
ensemble-based ML model. First, the paper briefly describes the ensemble algorithm XGBoost. Then, it
discusses the limitation of standard XGBoost when data is imbalanced. To address this, a modified
XGBoost-based student performance prediction model is presented [24], [25]. The modified XGBoost (MXGB)
encompasses an improved cross-validation mechanism for establishing the features affecting the accuracy
of the student performance prediction model. Finally, an ensemble-based ML model is constructed for
building an effective student performance predictive model. The significance of this research is as
follows: i) the proposed student performance prediction model employs an efficient ensemble-based
predictive model through MXGB, which works well even when data is imbalanced; ii) MXGB encompasses an
improved cross-validation mechanism to study which features impact the accuracy of the student
prediction model; and iii) the proposed student performance prediction model achieves better receiver
operating characteristic (ROC) performance in terms of accuracy, sensitivity, specificity, precision,
and F-measure in comparison with the state-of-the-art student performance prediction model.
Figure 1. General design of student performance prediction through ML models


Section 2 presents the ML model for EDM of student session streams. Section 3 discusses the outcome
achieved using the proposed MXGB-based student performance prediction model in comparison with the
existing ensemble-based student performance prediction model. The last section discusses the
significance of the MXGB-based student performance prediction model over the existing ensemble-based
student performance prediction model.


2. MACHINE LEARNING MODEL FOR EDM OF STUDENT SESSION STREAMS 

This section presents an improved ML model, namely MXGB, for EDM of student session streams.
MXGB improves on the standard XGBoost by incorporating an effective feature selection mechanism.
The standard EDM dataset is defined as (1):

$$E = \{(a_1, b_1), (a_2, b_2), \ldots, (a_m, b_m)\} \tag{1}$$

where $j = 1, 2, 3, \ldots, m$ indexes the $m$ rows considered, $b_j \in \{-1, 1\}$ defines the output
of the $j^{th}$ row, and $a_j$ defines the $n$-dimensional vector of independent features of row $j$.
In general, EDM data has diverse, multi-dimensional features but few rows $m$. Thus, a student
performance prediction model $\hat{g}$ is studied and designed to estimate the actual mapping $g$ from
features to outputs, which is defined as (2):

$$g: A \rightarrow B \tag{2}$$

In this work, the feature selection process during XGBoost training is modified through minimization of
the objective function, and an effective student performance prediction model is designed as shown in
Figure 2.
Figure 2. Proposed ML model for EDM of student session streams


2.1. XGBoost prediction algorithm 

The XGBoost algorithm is an improved version of the gradient boosting algorithm [25], in which weak
classifiers are combined to construct a strong classifier and attain better classification outcomes.
Consider student session stream data
$E = \{(y_j, z_j);\ j = 1, \ldots, o,\ y_j \in \mathbb{R}^n,\ z_j \in \mathbb{R}\}$, which is composed
of $o$ samples with $n$ features. Let $\hat{z}_j$ be the outcome predicted by the model as (3):

$$\hat{z}_j = \sum_{l} g_l(y_j), \quad g_l \in \mathcal{G} \tag{3}$$

where $g_l$ defines a distinct regression tree in the space of regression trees $\mathcal{G}$, and
$g_l(y_j)$ defines the prediction outcome provided by the $l$-th tree for the $j$-th sample. The
regression trees $g_l$ can be learned through minimization of the following objective in (4):

$$G = \sum_{j} m(z_j, \hat{z}_j) + \sum_{l} \beta(g_l) \tag{4}$$

In this work, $m$ defines the training loss that measures the difference between the predicted value
$\hat{z}_j$ and the actual value $z_j$. To avoid over-fitting, the term $\beta$ is used for penalizing
the complexity of the predictive model as (5):

$$\beta(g_l) = \delta U + \frac{1}{2} \mu \lVert x \rVert^2 \tag{5}$$

where $\delta$ and $\mu$ define the regularization parameters, $U$ defines the number of leaves, and
$x$ defines the scores of the different leaves. The ensemble of trees is constructed through an
additive process. Let $\hat{z}_j^{(u)}$ define the prediction outcome of the $j$-th sample at the
$u$-th iteration; a tree $g_u$ must be added to minimize (6):

$$G^{(u)} = \sum_{j=1}^{o} m\left(z_j, \hat{z}_j^{(u-1)} + g_u(y_j)\right) + \beta(g_u) \tag{6}$$

Equation (6) is simplified by applying a second-order Taylor expansion and eliminating the constant
terms as (7):

$$G^{(u)} \simeq \sum_{j=1}^{o} \left[ h_j g_u(y_j) + \frac{1}{2} i_j g_u^2(y_j) \right] + \beta(g_u) \tag{7}$$

where $h_j$ defines the first-order gradient of $m$ as (8):

$$h_j = \partial_{\hat{z}^{(u-1)}} m\left(z_j, \hat{z}_j^{(u-1)}\right) \tag{8}$$

and $i_j$ defines the second-order gradient of $m$ as (9):

$$i_j = \partial^2_{\hat{z}^{(u-1)}} m\left(z_j, \hat{z}_j^{(u-1)}\right) \tag{9}$$

Therefore, the objective of the predictive model is expressed using (10):

$$G^{(u)} = \sum_{j=1}^{o} \left[ h_j g_u(y_j) + \frac{1}{2} i_j g_u^2(y_j) \right] + \delta U + \frac{1}{2} \mu \sum_{k=1}^{U} x_k^2 \tag{10}$$

The simplified representation of (10) is given as (11):

$$G^{(u)} = \sum_{k=1}^{U} \left[ \Big( \sum_{j \in J_k} h_j \Big) x_k + \frac{1}{2} \Big( \sum_{j \in J_k} i_j + \mu \Big) x_k^2 \right] + \delta U \tag{11}$$

where $J_k$ defines the sample set of leaf $k$, so that the objective and the sample set are
represented as (12) and (13):

$$G^{(u)} = \sum_{k=1}^{U} \left[ \Big( \sum_{j \in J_k} h_j \Big) x_k + \frac{1}{2} \Big( \sum_{j \in J_k} i_j + \mu \Big) x_k^2 \right] + \delta U \tag{12}$$

$$J_k = \{ j \mid r(y_j) = k \} \tag{13}$$

where $r$ defines the structure of the tree, which is fixed. The optimal weight $x_k^*$ of leaf $k$ is
obtained through (14):

$$x_k^* = -\frac{H_k}{I_k + \mu} \tag{14}$$

In addition, the corresponding optimal objective value is obtained as (15):

$$G^* = -\frac{1}{2} \sum_{k=1}^{U} \frac{H_k^2}{I_k + \mu} + \delta U \tag{15}$$

where $H_k$ is represented as (16):

$$H_k = \sum_{j \in J_k} h_j \tag{16}$$

Similarly, $I_k$ is represented as (17):

$$I_k = \sum_{j \in J_k} i_j \tag{17}$$
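The step from (11) to (14) and (15) can be spelled out as a short worked derivation. This is a sketch in the notation of (11) and (14) to (17), assuming $I_k + \mu > 0$ so that each per-leaf quadratic has a unique minimum; it follows the standard XGBoost argument rather than anything specific to this paper.

```latex
% Each leaf k contributes an independent quadratic in x_k to (11):
%   H_k x_k + (1/2)(I_k + \mu) x_k^2
\frac{\partial}{\partial x_k}\left[ H_k x_k + \tfrac{1}{2}(I_k + \mu)\, x_k^2 \right]
  = H_k + (I_k + \mu)\, x_k = 0
  \;\Longrightarrow\;
  x_k^* = -\frac{H_k}{I_k + \mu}

% Substituting x_k^* back into the per-leaf term:
H_k x_k^* + \tfrac{1}{2}(I_k + \mu)\,(x_k^*)^2
  = -\frac{H_k^2}{I_k + \mu} + \frac{H_k^2}{2\,(I_k + \mu)}
  = -\frac{1}{2}\,\frac{H_k^2}{I_k + \mu}

% Summing over the U leaves and adding the penalty \delta U gives (15):
G^* = -\frac{1}{2}\sum_{k=1}^{U} \frac{H_k^2}{I_k + \mu} + \delta U
```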
$G^*$ measures the quality of the tree structure $r$, where a smaller value indicates a better tree
structure. Although XGBoost is efficient in obtaining high prediction accuracy, poor feature selection
under unknown environments or with imbalanced data degrades its prediction accuracy. To address this
research problem, an effective feature selection within the training data is modeled in the next
sub-section.
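To make the leaf-weight and tree-quality formulas in (14) to (17) concrete, the following minimal Python sketch computes them for a toy tree with two leaves. It assumes a logistic loss with labels in {0, 1} and a fixed, hand-written assignment of samples to leaves; the function names, the toy data, and the parameter values are illustrative assumptions and are not taken from the paper.

```python
import math

def logistic_grad_hess(z, margin):
    """First- and second-order gradients (h_j, i_j) of the logistic loss
    with respect to the previous raw prediction (assumed loss, labels in {0, 1})."""
    p = 1.0 / (1.0 + math.exp(-margin))
    return p - z, p * (1.0 - p)

def leaf_statistics(labels, margins, leaf_of):
    """Accumulate H_k and I_k, the gradient sums over the samples of each leaf k."""
    stats = {}
    for z, margin, k in zip(labels, margins, leaf_of):
        h, i = logistic_grad_hess(z, margin)
        H, I = stats.get(k, (0.0, 0.0))
        stats[k] = (H + h, I + i)
    return stats

def optimal_leaf_weights(stats, mu):
    """Optimal leaf weights: x_k* = -H_k / (I_k + mu)."""
    return {k: -H / (I + mu) for k, (H, I) in stats.items()}

def structure_score(stats, mu, delta):
    """Tree quality: G* = -1/2 * sum_k H_k^2 / (I_k + mu) + delta * U."""
    U = len(stats)
    return -0.5 * sum(H * H / (I + mu) for H, I in stats.values()) + delta * U

# Hypothetical toy data: five samples, all previous predictions at 0,
# and a fixed tree structure that sends samples 0-1 to leaf 0 and 2-4 to leaf 1.
labels = [1, 1, 0, 0, 1]
margins = [0.0] * 5
leaf_of = [0, 0, 1, 1, 1]

stats = leaf_statistics(labels, margins, leaf_of)
weights = optimal_leaf_weights(stats, mu=1.0)
score = structure_score(stats, mu=1.0, delta=0.1)
print(weights, score)
```

A lower score indicates a better structure, so candidate splits can be compared by evaluating `structure_score` before and after the split.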


2.2. Modified XGBoost prediction algorithm 

In this work, the feature selection process of standard XGBoost is modified by establishing better
feature importance outcomes to achieve an improved prediction scheme. The feature selection process is
improved by optimizing the cross-validation to obtain a minimal validation error. The K-fold
cross-validation scheme is used for optimizing the outcome of the predictive model, where the dataset
is randomly divided into K subsets of equal size. Then, K-1 subsets are used for constructing the
student prediction model, and the remaining subset is used for estimating the prediction error of the
student prediction model. Lastly, the mean of the prediction errors over the K different combinations
is used for optimizing the cross-validation error. After that, a grid of $l$ candidate values is
evaluated to obtain the optimal prediction that minimizes the cross-validation error considering
feature importance, and the student prediction model with minimal cross-validation error is chosen.
The proposed cross-validation scheme with effective feature selection is composed of two phases. In the
first phase, the main features are selected from feature subsets. In the second phase, the features
chosen in the first phase are utilized for constructing an effective student performance prediction
model. The traditional single-fold cross-validation error is constructed as (18):


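$$CV(\sigma) = \frac{1}{M} \sum_{k=1}^{K} \sum_{j \in G_{-k}} P\left(b_j, \hat{g}_{\sigma}^{-k(j)}(a_j, \sigma)\right) \tag{18}$$

However, the above equation does not identify which feature affects the accuracy of the predictive
model. To address this, an effective cross-validation that selects the high-importance features
affecting prediction accuracy is modeled as (19):

$$CV_s(\sigma) = \frac{1}{SM} \sum_{s=1}^{S} \sum_{k=1}^{K} \sum_{j \in G_{-k}} P\left(b_j, \hat{g}_{\sigma}^{-k(j)}(a_j, \sigma)\right) \tag{19}$$

In (19), the ideal $\hat{\sigma}$ for optimizing the student prediction model is selected as (20):

$$\hat{\sigma} = \underset{\sigma \in \{\sigma_1, \ldots, \sigma_l\}}{\arg\min}\; CV_s(\sigma) \tag{20}$$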
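As a rough illustration of the repeated K-fold cross-validation error and the grid selection described by (19) and (20), the following Python sketch averages the held-out loss over S random K-fold splits and picks the grid value with the smallest error. The helper names, the toy one-parameter model (a shrunk slope), and the toy data are assumptions made only for this example; they stand in for the XGBoost-based model and the e-learning data used in the paper.

```python
import random

def repeated_kfold_cv(samples, fit, loss, sigma, K=5, S=3, seed=0):
    """Mean held-out loss over S random K-fold splits of the samples."""
    rng = random.Random(seed)
    M = len(samples)
    total = 0.0
    for _ in range(S):
        order = list(range(M))
        rng.shuffle(order)
        folds = [order[k::K] for k in range(K)]
        for held_out in folds:
            held = set(held_out)
            # Train on the other K-1 folds, evaluate on the held-out fold.
            train = [samples[j] for j in order if j not in held]
            predict = fit(train, sigma)
            total += sum(loss(samples[j][1], predict(samples[j][0])) for j in held_out)
    return total / (S * M)

def select_sigma(samples, fit, loss, grid, K=5, S=3, seed=0):
    """Pick the grid value with the smallest repeated cross-validation error."""
    return min(grid, key=lambda s: repeated_kfold_cv(samples, fit, loss, s, K, S, seed))

# Hypothetical stand-in model: a slope shrunk toward zero by the parameter sigma.
def fit_shrunk_slope(train, sigma):
    w = sum(a * b for a, b in train) / (sum(a * a for a, _ in train) + sigma)
    return lambda a: w * a

def squared_loss(b, b_hat):
    return (b - b_hat) ** 2

# Toy data with b = 2a exactly, so no shrinkage is the best choice.
data = [(float(a), 2.0 * a) for a in range(1, 11)]
best = select_sigma(data, fit_shrunk_slope, squared_loss, [0.0, 1.0, 10.0])
print(best)  # prints 0.0
```

Reusing the same seed for every grid value means all candidates are scored on identical splits, which keeps the comparison between grid values fair.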


In (19), $M$ defines the size of the training dataset considered, $P(\cdot)$ defines the loss function,
and $\hat{g}_{\sigma}^{-k(j)}(\cdot)$ defines a function to compute coefficients. Equation (19) is
executed iteratively for constructing the best student performance prediction model; that is, the
training error is optimized in the first phase, and the resulting parameter is passed on to the second
phase to understand and update the feature importance characteristics in the predictive model. The
effective features are obtained by minimizing the objective function with a gradient descent
mechanism. The effective features are then selected using the ranking method $r(\cdot)$ for
constructing a student performance prediction model through (21):


$$r(n_j) = \begin{cases} 0 & \text{if } n_j \text{ is not selected} \\ 1 & \text{if } n_j \text{ is selected as optimal prediction model} \end{cases} \qquad j = 1, 2, 3, \ldots, n \tag{21}$$

The feature subset is constructed as (22):

$$F_s = \{r(n_1), r(n_2), \ldots, r(n_n)\} \tag{22}$$

The ideal features with maximum score considering the varied K-fold instances are obtained as (23):

$$F_{s_k} = \{r(n_1), r(n_2), \ldots, r(n_n)\} \tag{23}$$

Then, the number of times a particular feature is selected across the K feature subsets having
maximum score is computed, and the final feature subset is obtained as (24):

$$F_{s,\text{final}} = \{f_s(n_1), f_s(n_2), \ldots, f_s(n_n)\} \tag{24}$$

where $f_s(\cdot)$ indicates whether feature $n_j$ is selected or not, and is mathematically
represented as (25):

$$f_s(n_j) = \begin{cases} 0 & \text{if } n_j \text{ is chosen fewer than } K/2 \text{ times} \\ 1 & \text{if } n_j \text{ is chosen at least } K/2 \text{ times} \end{cases} \qquad j = 1, 2, 3, \ldots, n \tag{25}$$


Equation (25) is used to generate a subset of $n'$ selected features according to how many times each
feature is selected. The EDM training data is then restricted to the selected features for building an
effective student prediction model. To reduce randomness during the training process, the K folds are
built by iterating $S$ times in the first phase. In the second phase, a subset of features is selected
to reduce variance. Therefore, the proposed MXGB-based student performance prediction model
significantly improves overall prediction accuracy in comparison with state-of-the-art ML-based student
performance prediction schemes.
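The vote that turns per-fold feature selections into a final feature subset, as in (21) to (25), can be sketched in a few lines of Python. The function name, the `min_fraction` parameter, and the example masks are hypothetical; the default threshold of half the folds is an assumption made here for illustration, and how each fold decides which features to select (for example, from XGBoost importance scores) is left outside the sketch.

```python
def final_feature_mask(fold_masks, min_fraction=0.5):
    """Combine per-fold 0/1 selection indicators into a final 0/1 mask.

    fold_masks: one list per fold, each with one 0/1 entry per feature.
    A feature is kept when it is selected in at least min_fraction of the
    folds (the 0.5 default is an assumed threshold, not a confirmed value).
    """
    K = len(fold_masks)
    n = len(fold_masks[0])
    counts = [sum(mask[j] for mask in fold_masks) for j in range(n)]
    return [1 if c >= min_fraction * K else 0 for c in counts]

# Hypothetical selections from K = 4 folds over n = 3 features.
masks = [
    [1, 0, 1],
    [1, 0, 0],
    [1, 1, 0],
    [0, 0, 1],
]
print(final_feature_mask(masks))  # prints [1, 0, 1]
```

The training data would then be restricted to the columns whose mask entry is 1 before the second-phase model is fitted.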


3. RESULT AND ANALYSIS 

In this section, student performance prediction using the proposed MXGB and other existing ML-based
student prediction methods is studied [22]. The e-learning dataset from [22] is used for performance
analysis; the dataset is selected to allow comparison with [22]. The student performance prediction
models are implemented in Python 3. The ROC performance metrics such as accuracy, sensitivity,
specificity, precision, and F-measure are used for validating the student performance prediction
model. The accuracy is computed as (26):

$$\text{Accuracy} = \frac{TP + TN}{TP + FP + TN + FN} \tag{26}$$

where $TP$ defines true positives, $FP$ defines false positives, $TN$ defines true negatives, and $FN$
defines false negatives. The sensitivity is computed as (27):

$$\text{Sensitivity} = \frac{TP}{TP + FN} \tag{27}$$

The specificity is computed as (28):

$$\text{Specificity} = \frac{TN}{TN + FP} \tag{28}$$

The precision is computed as (29):

$$\text{Precision} = \frac{TP}{TP + FP} \tag{29}$$

The F-measure is computed as (30):

$$F\text{-measure} = \frac{2 \times \text{Precision} \times \text{Sensitivity}}{\text{Precision} + \text{Sensitivity}} \tag{30}$$
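The metrics in (26) to (30) can be computed directly from the four confusion-matrix counts. The following Python sketch is a minimal illustration; the function name and the example counts are hypothetical and do not come from the experiments reported here, and the sketch assumes every denominator is nonzero.

```python
def classification_metrics(tp, fp, tn, fn):
    """Accuracy, sensitivity, specificity, precision and F-measure
    from confusion-matrix counts (assumes nonzero denominators)."""
    accuracy = (tp + tn) / (tp + fp + tn + fn)
    sensitivity = tp / (tp + fn)
    specificity = tn / (tn + fp)
    precision = tp / (tp + fp)
    f_measure = 2 * precision * sensitivity / (precision + sensitivity)
    return {
        "accuracy": accuracy,
        "sensitivity": sensitivity,
        "specificity": specificity,
        "precision": precision,
        "f_measure": f_measure,
    }

# Hypothetical counts for a binary pass/fail prediction task.
metrics = classification_metrics(tp=40, fp=5, tn=45, fn=10)
for name, value in metrics.items():
    print(f"{name}: {value:.4f}")
```

With imbalanced data, accuracy alone can be misleading, which is why sensitivity and specificity are reported alongside it.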


3.1. Predictive model performance evaluation 

In this section, different ML-based student performance prediction models are studied in terms of
specificity and sensitivity. Figure 3 shows the specificity achieved using different student
performance prediction models, namely random forest (RF)-based, logistic regression (LR)-based,
ensemble-based [22], XGBoost-based, and the proposed MXGB-based models. The RF-based model attains a
specificity of 0.875, the LR-based model 0.75, the ensemble-based model 0.857, the XGBoost-based model
0.8502, and the proposed MXGB-based model 0.946. A specificity closer to 1 indicates a better
prediction model. Thus, the proposed MXGB-based student performance prediction model is more efficient
than the other ML-based student performance prediction models in terms of specificity. Figure 4 shows
the sensitivity achieved using the RF-based, LR-based, ensemble-based, XGBoost-based, and proposed
MXGB-based models. The RF-based model attains a sensitivity of 1, the LR-based model 0.857, the
ensemble-based model 0.857, the XGBoost-based model 0.9449, and the proposed MXGB-based model 1. A
sensitivity closer to 1 indicates a better prediction model. Thus, the RF-based and the proposed
MXGB-based student performance prediction models are more efficient than the other ML-based student
performance prediction models in terms of sensitivity. However, the MXGB-based model provides a better
tradeoff between sensitivity and specificity, thus attaining much better student performance
prediction accuracy.

Further, performance is validated considering different ROC metrics such as specificity, recall,
accuracy, precision, and F-measure using different predictive models, as shown in Figure 5, which
presents the ROC performance of different ML-based student performance prediction models. From
Figure 5, we can see that the factor analysis-based XGBoost (FA-XGB) predictive model achieves much
better performance than the XGBoost and ensemble-based predictive models.
Figure 4. Sensitivity performance of different ML algorithms for predicting student performance
Figure 5. ROC performance of different ML-based student performance prediction models

3.2. Feature importance performance

Figure 6 shows a graphical representation of the feature importance obtained using the XGBoost and
FA-XGB-based predictive models. From Figure 6, we can see that FA-XGB gives higher importance to
features in comparison with XGBoost. Further, the FA-XGB-based predictive model ranks the features in
the following order of importance: Kolmogorov-Smirnov (KS), weight (WT), majorization-minimization
(MM), moving window (MW), machine learning-based checker (MLC), machine reading comprehension (MRC),
and moving window classifier (MWC). The XGB-based predictive model, on the other hand, ranks them in
the following order: WT, KS, MW, MM, MRC, MLC, and MWC. In both cases, MWC is given very little
importance. Figure 6 also shows how selecting the right features aids in improving the overall
classification accuracy of the proposed FA-XGB-based predictive model.


Figure 6. Feature ranking score graphical representation


3.3. Student performance prediction for a different session 

Here, the performance is validated considering different ROC metrics such as specificity, recall,
accuracy, precision, and F-measure for sessions 2 to 6 using the XGBoost and FA-XGB predictive models,
as shown in Figures 7 to 11. Figure 7 shows the accuracy, Figure 8 the sensitivity, and Figures 9 to
11 the specificity, precision, and F-measure of the ML-based student performance prediction models for
different sessions. From Figures 7 to 11, we can see that the FA-XGB-based predictive model achieves
much better ROC performance than the XGBoost-based predictive model.
Figure 7. Accuracy performance using ML-based student performance prediction model for different sessions
Figure 8. Sensitivity performance using ML-based student performance prediction model for different sessions
Figure 9. Specificity performance using ML-based student performance prediction model for different sessions

3.4. Feature ranking importance

The feature ranking scores of the XGBoost-based and MXGB-based student performance prediction models
for different sessions are shown in Figures 12 to 16. Figure 12 shows the feature ranking scores for
session 2: the XGBoost-based model gives a higher score to MW and a lower score to MRC, whereas the
MXGB-based model gives a higher score to WT and a lower score to MRC. Figure 13 shows the feature
ranking scores for session 3: both the XGBoost-based and MXGB-based models give higher scores to MM
and lower scores to MWC; however, the MXGB-based model gives much higher feature importance than the
XGBoost-based model. Figure 14 shows the feature ranking scores for session 4: the XGBoost-based model
gives a higher score to MW and lower scores to KS, MWC, and MRC, whereas the MXGB-based model gives a
higher score to MM and a lower score to MWC. Figure 15 shows the feature ranking scores for session 5:
the XGBoost-based model gives higher scores to KS and WT and lower scores to MW, MM, and MWC, whereas
the MXGB-based model gives a higher score to KS and a lower score to MWC. Figure 16 shows the feature
ranking scores for session 6: the XGBoost-based model gives a higher score to KS and lower scores to
MLC and MWC, whereas the MXGB-based model gives a higher score to KS and a lower score to MWC.
Figures 12 to 16 show that the MXGB-based model gives higher importance to features than the
XGBoost-based student performance prediction model. This helps the MXGB-based student performance
prediction model achieve higher accuracy than the ensemble-based and XGBoost-based student performance
prediction models.
Figure 12. Feature ranking score graphical representation for session 2
Figure 13. Feature ranking score graphical representation for session 3
Figure 14. Feature ranking score graphical representation for session 4


[Figure: "Feature Importance Study S-5"; x-axis "Session Streams Features": KS, WT, MW, MRC, MM, MLC, MWC; series: XGB, MXGB]


Figure 15. Feature ranking score graphical representation for session 5 


Int J Reconfigurable & Embedded Syst, Vol. 13, No. 2, July 2024: 383-394 


ISSN: 2089-4864 


Figure 16. Feature ranking score graphical representation for session 6


4. CONCLUSION 

Predicting the performance of a student by analyzing the student session stream is a challenging
task. ML algorithms have been used by various existing student performance prediction models to
achieve improved prediction outcomes. However, these models tend to achieve high accuracy on specific
student data and perform poorly when adapted to new data. To address such issues, recent work has used
ensemble-based ML models that choose the best model to perform the prediction task. However, existing
ensemble-based models perform poorly when data is imbalanced. This paper presented an efficient
ensemble machine-learning model, obtained by modifying XGBoost, that works well even when training
data is imbalanced. An effective cross-validation scheme is presented to identify which features
impact the accuracy of the prediction model. The cross-validation scheme employs an effective feature
ranking mechanism to improve prediction accuracy by optimizing the prediction error. The experiment is
conducted using standard student session stream data. The proposed MXGB model significantly improves
accuracy, sensitivity, specificity, precision, and F-measure performance in comparison with RF-based,
LR-based, ensemble-based, and XGBoost-based student performance prediction models. In future work, the
performance of the MXGB model will be tested using more diverse datasets, and reducing training errors
for multi-class classification will be considered.